Extending a given language with new dedicated features is a general and widely used approach to adapt
a programming language to the problems at hand. Because the resulting programs are closer to the
application, this leads to fewer programming flaws and easier maintenance. One would of course still
like to perform program analysis on such extended languages, in particular type checking and type
inference. In that case the typing of the extended features has to be made compatible with the typing
of the host language.

The Tom programming language is a typical example of such a situation as it consists of an extension 
of Java that adds pattern matching, more particularly associative pattern matching, and reduction 
strategies. 

This paper presents a type system with subtyping for Tom that is compatible with Java's type system
and that performs both type checking and type inference. We propose an algorithm that checks whether
all patterns of a Tom program are well-typed. In addition, we propose an algorithm based on equality
and subtyping constraints that infers the types of the variables occurring in a pattern. Both algorithms
are exemplified and the proposed type system is shown to be sound and complete.



1 Introduction of the problem: static typing in Tom 

We consider here the Tom language, which is an extension of Java that provides rule-based constructs.
In particular, any Java program is a Tom program. We call this kind of extension formal islands [3],
where the ocean consists of Java code and the island of algebraic patterns. For simplicity, we consider
here only two new Tom constructs: a %match construct and a ` (backquote) construct.

The semantics of %match is close to the match that exists in functional programming languages, but
in an imperative context. A %match is parameterized by a list of subjects (i.e. expressions evaluated
to ground terms) and contains a list of rules. The left-hand sides of the rules are patterns built upon
constructors and fresh variables, without any linearity restriction. The right-hand side is not a term, but a
Java statement that is executed when the pattern matches the subject. However, thanks to the backquote
construct (`) a term can easily be built and returned. In a similar way to the standard switch/case
construct, patterns are evaluated from top to bottom. In contrast to the functional match, several actions
(i.e. right-hand sides) may be fired for a given subject as long as no return or break instruction is
executed. To implement a simple reduction step for each rule, it suffices to encode the left-hand side
with a pattern and consider the Java statement that returns the right-hand side.

For example, given the sort Nat and the function symbols suc and zero, addition and comparison of
Peano integers may be encoded as follows:

    public Nat plus(Nat t1, Nat t2) {
      %match(t1, t2) {
        x, zero() -> { return `x; }
        x, suc(y) -> { return `suc(plus(x,y)); }
      }
    }

    public boolean greaterThan(Nat t1, Nat t2) {
      %match(t1, t2) {
        x, x             -> { return false; }
        suc(x), zero()   -> { return true; }
        zero(), suc(y)   -> { return false; }
        suc(x), suc(y)   -> { return `greaterThan(x,y); }
      }
    }


In this combination of an ocean language (in our case Java) and island features (in our case abstract data
types and matching), how to perform type checking and type inference is still an open question.
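To make the top-to-bottom reading of the rules concrete, the two functions above can be mimicked in plain Java without Tom. This is only a sketch: the class `Peano`, its field `pred`, and the helpers `same` and `toInt` are names assumed here for illustration, and the code Tom actually generates is different.

```java
// Hypothetical plain-Java encoding of the Peano example (not Tom's generated code).
final class Peano {
    final Peano pred; // null encodes zero(); otherwise this term is suc(pred)

    private Peano(Peano pred) { this.pred = pred; }

    static Peano zero() { return new Peano(null); }
    static Peano suc(Peano n) { return new Peano(n); }

    boolean isZero() { return pred == null; }

    // Structural equality, needed for the non-linear pattern "x, x".
    static boolean same(Peano a, Peano b) {
        while (!a.isZero() && !b.isZero()) { a = a.pred; b = b.pred; }
        return a.isZero() && b.isZero();
    }

    // Rules are tried from top to bottom, as in %match.
    static Peano plus(Peano t1, Peano t2) {
        if (t2.isZero()) return t1;            // x, zero() -> x
        return suc(plus(t1, t2.pred));         // x, suc(y) -> suc(plus(x, y))
    }

    static boolean greaterThan(Peano t1, Peano t2) {
        if (same(t1, t2)) return false;                 // x, x
        if (!t1.isZero() && t2.isZero()) return true;   // suc(x), zero()
        if (t1.isZero() && !t2.isZero()) return false;  // zero(), suc(y)
        return greaterThan(t1.pred, t2.pred);           // suc(x), suc(y)
    }

    // Convenience conversion used only to inspect results.
    static int toInt(Peano n) {
        int k = 0;
        while (!n.isZero()) { k++; n = n.pred; }
        return k;
    }
}
```

Each `if` corresponds to one rule of the %match block, and an early `return` plays the role of the return statement in the rule's action.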

Since we want to allow for type inclusion at the pattern level, the first purpose of this paper is to
present an extension of the signature definition mechanism allowing for subtypes. In this context we
define Java-like types and signatures. The set of types is therefore the union of Java types and abstract
data types (i.e. Tom types), where multiple inheritance and overloading are forbidden. For example, given
the sorts $Int^+$, $Int^-$, $Int$ and $Zero$, the type system accepts the declaration
$Int^+ <: Int \wedge Int^- <: Int$ but refuses the declaration $Zero <: Int^+ \wedge Zero <: Int^-$.
Moreover, a function symbol suc cannot be overloaded on both sorts $Int^+$ and $Int^-$. In order to
handle those issues, we propose an algorithm based on unification of equality constraints [14] and
simplification of subtype constraints. It infers the types of the variables that occur in a pattern
(x and y in the previous example). Moreover, we also propose an algorithm that checks that the patterns
occurring in a Tom program are correctly typed.

Of course, type systems for algebraic terms and for rewriting have a long history. It includes the
seminal works done on OBJ, order-sorted algebras and Maude [6]; the works done on feature
algebras or on membership constraints; and the works on typing rewriting in higher-order
settings. Largely inspired by these works, our contribution here focuses on the
appropriate type system for pattern matching, possibly modulo associativity, in a Java environment.



2 Type checking 

Given a signature, the (simplified) abstract syntax of a Tom program is as follows:

$rule ::= cond \to action$

$cond ::= term_1 \prec\!\!\prec_{[s]} term_2 \mid cond_1 \wedge cond_2$

$term ::= x \mid f(term_1,\ldots,term_n)$

$action ::= (term_1,\ldots,term_n)$

The left-hand side of a rule is a conjunction of matching conditions $term_1 \prec\!\!\prec_{[s]} term_2$,
each consisting of a pair of terms, where $s$ denotes a sort. We introduce the set $\mathcal{F}$ of free
symbols. Terms are many-sorted terms composed of variables $x \in \mathcal{X}$ and function symbols
$f \in \mathcal{F}$. The set of terms is written $\mathcal{T}(\mathcal{F},\mathcal{X})$.
In general, an action is a Java statement, but for our purpose it is enough to consider an abstraction
consisting of terms $e_1,\ldots,e_n \in \mathcal{T}(\mathcal{F},\mathcal{X})$ whose instantiations are
described by the conditions, and which are used in the Java statement.

Example 2.1. The last rule of the greaterThan function given above can be represented by the following
rule expression:

$suc(x) \prec\!\!\prec_{[Nat]} t_1 \wedge suc(y) \prec\!\!\prec_{[Nat]} t_2 \to (x,y)$

In a first step, we define $\mathcal{S}$ as a set of sorts and we consider that a context $\Gamma$ is
composed of a set of pairs (variable, sort) and (function symbol, rank):

$\Gamma ::= \emptyset \mid \Gamma_1 \cup \Gamma_2 \mid x : s \mid f : s_1,\ldots,s_n \to s$

Context access is defined by the function
$sortOf(\Gamma,e) : \Gamma \times \mathcal{T}(\mathcal{F},\mathcal{X}) \to \mathcal{S}$, which returns the sort
of term $e$ in the context $\Gamma$:

$sortOf(\Gamma,x) = s$, if $x : s \in \Gamma$
$sortOf(\Gamma,f(e_1,\ldots,e_n)) = s$, if $f : s_1,\ldots,s_n \to s \in \Gamma$

where $x \in \mathcal{X}$ and $f \in \mathcal{F}$.

We denote by $\Gamma(x : s)$ the fact that $x : s$ belongs to $\Gamma$. Similarly,
$\Gamma(f : s_1,\ldots,s_n \to s)$ means that $f : s_1,\ldots,s_n \to s$ belongs to $\Gamma$. In Fig. 1 we
give a classical type checking system defined by a set of inference rules. Starting from a context
$\Gamma$ and a rule expression $\pi$, we say that $\pi$ is well-typed if $\pi : wt$ can be derived by
applying the inference rules; $wt$ is a special sort that corresponds to the well-typedness
of a rule or a condition $cond$.



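[T-Var]   $\dfrac{}{\Gamma(x : s) \vdash x : s}$

[T-Fun]   $\dfrac{\Gamma \vdash e_1 : s_1 \quad \ldots \quad \Gamma \vdash e_n : s_n}{\Gamma(f : s_1,\ldots,s_n \to s) \vdash f(e_1,\ldots,e_n) : s}$

[T-Match] $\dfrac{\Gamma \vdash e_1 : s \quad \Gamma \vdash e_2 : s}{\Gamma \vdash (e_1 \prec\!\!\prec_{[s]} e_2) : wt}$

[T-Conj]  $\dfrac{\Gamma \vdash (cond_1) : wt \quad \ldots \quad \Gamma \vdash (cond_n) : wt}{\Gamma \vdash (cond_1 \wedge \ldots \wedge cond_n) : wt}$

[T-Rule]  $\dfrac{\Gamma \vdash (cond) : wt \quad \Gamma \vdash e_1 : s_1 \quad \ldots \quad \Gamma \vdash e_n : s_n}{\Gamma \vdash (cond \to (e_1,\ldots,e_n)) : wt}$
          if $sortOf(\Gamma,e_i) = s_i$, for $i \in [1,n]$

Figure 1: Simple type checking system.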
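The rules of Fig. 1 can be read as a recursive procedure. The following is a minimal sketch of that reading, assuming a representation of our own choosing: the class `SimpleChecker`, the nested `Term` class and the two maps standing for the context are illustrative names, not part of Tom.

```java
import java.util.*;

// Minimal sketch of the checker of Fig. 1; all class and method names are illustrative.
final class SimpleChecker {
    // A term is a variable (args == null) or an application f(args).
    static final class Term {
        final String head;
        final List<Term> args;
        Term(String head, List<Term> args) { this.head = head; this.args = args; }
        static Term var(String x) { return new Term(x, null); }
        static Term app(String f, Term... as) { return new Term(f, Arrays.asList(as)); }
    }

    final Map<String, String> vars = new HashMap<>();    // x : s
    final Map<String, String[]> ranks = new HashMap<>(); // f : s1,...,sn -> s (last entry is s)

    // sortOf(Gamma, e): declared sort of a variable, or codomain of the head symbol.
    String sortOf(Term e) {
        if (e.args == null) return vars.get(e.head);
        String[] r = ranks.get(e.head);
        return r == null ? null : r[r.length - 1];
    }

    // T-Var and T-Fun: e has sort s iff sortOf agrees and every argument checks.
    boolean check(Term e, String s) {
        if (s == null || !s.equals(sortOf(e))) return false;
        if (e.args == null) return true;                   // T-Var
        String[] r = ranks.get(e.head);
        if (r.length - 1 != e.args.size()) return false;   // arity mismatch
        for (int i = 0; i < e.args.size(); i++)
            if (!check(e.args.get(i), r[i])) return false; // premises of T-Fun
        return true;
    }

    // T-Match: both sides of a matching condition on sort s must have sort s.
    boolean checkMatch(Term e1, String s, Term e2) {
        return check(e1, s) && check(e2, s);
    }
}
```

With `suc : Nat -> Nat` and `x, t1 : Nat` in the context, `checkMatch(suc(x), "Nat", t1)` succeeds, which corresponds to one condition of Example 2.1.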



2.1 Subtypes and associative-matching 

In order to introduce subtypes in Tom, we refine $\mathcal{S}$ as the set of sorts equipped with a partial
order $<:$, called subtyping. It is a binary relation on $\mathcal{S}$ that satisfies reflexivity,
transitivity and antisymmetry. Moreover, since we allow some symbols to be associative, we introduce the
set $\mathcal{F}_v$ of variadic symbols to denote them. The set of terms is now written
$\mathcal{T}(\mathcal{F} \cup \mathcal{F}_v,\mathcal{X})$, and terms are many-sorted variadic
terms composed of variables $x \in \mathcal{X}$ and function symbols $f \in \mathcal{F} \cup \mathcal{F}_v$.
In the following, we often write $\ell$ for a variadic operator and call it a list.

We extend matching over lists to be associative. A pattern therefore matches a subject up to
equality modulo flattening. Lists can be denoted by function symbols $\ell \in \mathcal{F}_v$ or by variables
$x \in \mathcal{X}$ annotated by $*$. Such variables, which we write $x^*$, are called star variables. So we
consider in the following many-sorted variadic terms composed of variables $x \in \mathcal{X}$, star variables
$x^*$ (where $x \in \mathcal{X}$) and function symbols $f \in \mathcal{F} \cup \mathcal{F}_v$. Moreover, a
function symbol $\ell \in \mathcal{F}_v$ with variable domain (since it has a variable arity) of sort $s_1$
and codomain $s$ is written $\ell : s_1^* \to s$, while star variables $x^*$ are also sorted and written
$x^* : s$.
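Equality modulo flattening can be illustrated by normalizing terms before comparing them. The sketch below is an assumption-laden illustration rather than Tom's implementation: the class `Flatten`, its `Term` representation and the `variadic` set are names introduced here. A nested list is merged into its parent only when both carry the same variadic symbol.

```java
import java.util.*;

// Sketch of flattening for variadic (list) operators; names are illustrative only.
final class Flatten {
    static final class Term {
        final String head;
        final List<Term> args;
        Term(String head, List<Term> args) { this.head = head; this.args = args; }
        static Term app(String f, Term... as) { return new Term(f, Arrays.asList(as)); }
        @Override public String toString() {
            StringBuilder sb = new StringBuilder(head).append("(");
            for (int i = 0; i < args.size(); i++) {
                if (i > 0) sb.append(",");
                sb.append(args.get(i));
            }
            return sb.append(")").toString();
        }
    }

    // Flatten nested occurrences of the same variadic symbol:
    // l(a, l(b, c)) becomes l(a, b, c), while l(a, l1(b)) is left nested.
    static Term flatten(Term t, Set<String> variadic) {
        List<Term> out = new ArrayList<>();
        for (Term a : t.args) {
            Term fa = flatten(a, variadic);
            if (variadic.contains(t.head) && fa.head.equals(t.head)) out.addAll(fa.args);
            else out.add(fa);
        }
        return new Term(t.head, out);
    }
}
```

Two list terms are then equal modulo flattening when their flattened forms are syntactically equal.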

Since terms built from syntactic and variadic operators can have the same codomain, we cannot
distinguish one from the other by their sorts alone. This distinction is however necessary to know which
typing rule applies. Moreover, the insertion of a term into a list can be treated in two ways. Given terms
$\ell(e_1), \ell(e_2), \ell_1(e_1) \in \mathcal{T}(\mathcal{F} \cup \mathcal{F}_v, \mathcal{X})$ where
$\ell, \ell_1 \in \mathcal{F}_v$, we have: 1) an insertion of a list $\ell(e_1)$ into a list $\ell(e_2)$
corresponds to a concatenation of both lists, resulting in $\ell(e_1,e_2)$; 2) an insertion of a list
$\ell_1(e_1)$ into a list $\ell(e_2)$ results in $\ell(\ell_1(e_1),e_2)$. It is therefore important to
distinguish the list from the inserted term by its function symbol, in order to decide which typing rule
for lists must be applied. For this purpose, we introduce a notion of sorts decorated with function
symbols, called types, to classify terms. The special symbol $?$ is used as decoration when it is not
useful to know what the function symbol is, i.e. when the expected type is known but not the expected
function symbol. This leads to a new set of decorated sorts $\mathcal{D}$, which is equipped with a partial
order $<:_s$. It is a binary relation on $\mathcal{D}$ where $s_1^{g_1} <:_s s_2^{g_2}$
is equivalent to $s_1 <: s_2 \wedge (g_1 = g_2 \vee g_2 = {?})$.
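The definition of the order on decorated sorts translates directly into a two-part test. The sketch below assumes single inheritance, so that each sort has at most one direct supersort stored in a map; the class `DecoratedSorts` and its methods are hypothetical names used only for illustration, and the declarations are assumed to be acyclic.

```java
import java.util.*;

// Sketch of the decorated subtype test; the class and its API are assumptions.
final class DecoratedSorts {
    static final String ANY = "?"; // the special decoration
    // Single inheritance: each sort maps to its unique direct supersort (assumed acyclic).
    private final Map<String, String> parent = new HashMap<>();

    void declare(String sub, String sup) {
        if (parent.containsKey(sub))
            throw new IllegalArgumentException("multiple inheritance: " + sub);
        parent.put(sub, sup);
    }

    // s1 <: s2 on plain sorts: reflexive-transitive closure of the declarations.
    boolean isSubsort(String s1, String s2) {
        for (String s = s1; s != null; s = parent.get(s))
            if (s.equals(s2)) return true;
        return false;
    }

    // s1^g1 <:_s s2^g2  iff  s1 <: s2 and (g1 = g2 or g2 = ?).
    boolean isSubtype(String s1, String g1, String s2, String g2) {
        return isSubsort(s1, s2) && (g1.equals(g2) || ANY.equals(g2));
    }
}
```

For instance, after `declare("N", "Z")`, the type `N^one` is a subtype of `Z^?` but not of `Z^l`, because the decorations differ and the target decoration is not `?`.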

As pointed out in the introduction, we assume throughout this paper that the signatures considered do not
have multiple inheritance and that function symbol overloading is not allowed.

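Given these notions, we refine the notion of context $\Gamma$ as a set of subtyping declarations (type, type)
and pairs (variable, type) and (function symbol, rank). This is expressed by the following grammar:

$\Gamma ::= \emptyset \mid \Gamma_1 \cup \Gamma_2 \mid s_1^{?} <:_s s_2^{?} \mid x : s^g \mid x^* : s^\ell \mid f : s_1^{?},\ldots,s_n^{?} \to s^f \mid \ell : (s_1^{?})^* \to s^\ell$

where the subtyping declarations are read modulo the reflexive transitive closure of $<:_s$, and context
access is refined by the function
$sortOf(\Gamma,e) : \Gamma \times \mathcal{T}(\mathcal{F} \cup \mathcal{F}_v,\mathcal{X}) \to \mathcal{D}$,
which returns the type of term $e$ in the context $\Gamma$:

$sortOf(\Gamma,x) = s^g$, if $x : s^g \in \Gamma$
$sortOf(\Gamma,f(e_1,\ldots,e_n)) = s^f$, if $f : s_1^{?},\ldots,s_n^{?} \to s^f \in \Gamma$
$sortOf(\Gamma,x^*) = s^\ell$, if $x^* : s^\ell \in \Gamma$
$sortOf(\Gamma,\ell(e_1,\ldots,e_n,e)) = s^\ell$, if $\ell : (s_1^{?})^* \to s^\ell \in \Gamma$

where $x \in \mathcal{X}$, $f \in \mathcal{F}$, $\ell \in \mathcal{F}_v$, $g \in \mathcal{F} \cup \mathcal{F}_v \cup \{?\}$
and $s_i^{?}, s^f, s^g, s^\ell \in \mathcal{D}$.

The context has at most one declaration of type or signature per term, since overloading is forbidden.
This means that for $e \in \mathcal{T}(\mathcal{F} \cup \mathcal{F}_v,\mathcal{X})$ and $s_1^{g_1}, s_2^{g_2}$
(where $g_1,g_2 \in \mathcal{F} \cup \mathcal{F}_v \cup \{?\}$ and $s_1^{g_1}, s_2^{g_2} \in \mathcal{D}$), if
$e : s_1^{g_1} \in \Gamma$ and $e : s_2^{g_2} \in \Gamma$ then $s_1^{g_1} = s_2^{g_2}$. We denote by
$\Gamma(s_1^{g_1} <:_s s_2^{g_2})$ the fact that $s_1^{g_1} <:_s s_2^{g_2}$ belongs to $\Gamma$.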

2.2 Type checking algorithm 

In Fig. 2 we give a type checking system for many-sorted variadic terms under associative matching.
The rules are standard except for the use of decorated types. The most interesting rules are those that
apply to lists. There are three of them: [T-Empty] checks that an empty list has the type declared in
$\Gamma$; [T-Elem] is similar to [T-Fun] but is applied to lists; and [T-Merge] is applied to a concatenation
of two lists of type $s^\ell$ in $\Gamma$, resulting in a new list with the same type $s^\ell$.

The type checking algorithm reads derivations bottom-up. Since the rule [Sub] can be applied to
any kind of term, we consider a strategy where it is applied iff no other typing rule can be applied. In
practice, [Sub] will be combined with [T-Var], [T-Fun] and [T-Elem], and the type $s_1$ which appears in
the premise will be defined according to the result of the function $sortOf(\Gamma,e)$. The algorithm stops
if it reaches the [T-Var] or [T-SVar] cases, ensuring that the original expression is well-typed, or if none
of the type checking rules can be applied, raising an error.

Example 2.2. Let $\Gamma = \{\ell : (Z^{?})^* \to Z^\ell,\ one : \to N^{one},\ x^* : Z^\ell,\ z^* : Z^\ell,\ y : Z^{?},\ N^{?} <:_s Z^{?}\}$.
Then the expression $\ell(x^*,y,z^*) \prec\!\!\prec_{[Z^{?}]} \ell(one()) \to (y)$ is well-typed and its
deduction tree is given in Fig. 3.

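[T-Var]   $\dfrac{}{\Gamma(x : s^g) \vdash x : s^g}$
          where $g \in \mathcal{F} \cup \mathcal{F}_v \cup \{?\}$

[T-SVar]  $\dfrac{}{\Gamma(x^* : s^\ell) \vdash x^* : s^\ell}$

[T-Fun]   $\dfrac{\Gamma \vdash e_1 : s_1^{?} \quad \ldots \quad \Gamma \vdash e_n : s_n^{?}}{\Gamma(f : s_1^{?},\ldots,s_n^{?} \to s^f) \vdash f(e_1,\ldots,e_n) : s^f}$

[T-Empty] $\dfrac{}{\Gamma(\ell : (s_1^{?})^* \to s^\ell) \vdash \ell() : s^\ell}$

[T-Elem]  $\dfrac{\Gamma \vdash \ell(e_1,\ldots,e_n) : s^\ell \quad \Gamma \vdash e : s_1^{?}}{\Gamma(\ell : (s_1^{?})^* \to s^\ell) \vdash \ell(e_1,\ldots,e_n,e) : s^\ell}$
          if $sortOf(\Gamma,e) \neq s^\ell$ and $e \neq x^*$

[T-Merge] $\dfrac{\Gamma \vdash \ell(e_1,\ldots,e_n) : s^\ell \quad \Gamma \vdash e : s^\ell}{\Gamma(\ell : (s_1^{?})^* \to s^\ell) \vdash \ell(e_1,\ldots,e_n,e) : s^\ell}$
          if $sortOf(\Gamma,e) = s^\ell$

[Sub]     $\dfrac{\Gamma \vdash e : s_1^{g}}{\Gamma(s_1^{g} <:_s s^{g}) \vdash e : s^{g}}$
          where $g \in \mathcal{F} \cup \mathcal{F}_v \cup \{?\}$

[Gen]     $\dfrac{\Gamma \vdash e : s^{h}}{\Gamma \vdash e : s^{?}}$
          if $sortOf(\Gamma,e) = s^h$, where $h \in \mathcal{F} \cup \mathcal{F}_v$

[T-Match] $\dfrac{\Gamma \vdash e_1 : s^{?} \quad \Gamma \vdash e_2 : s^{?}}{\Gamma \vdash (e_1 \prec\!\!\prec_{[s^{?}]} e_2) : wt}$

[T-Conj]  $\dfrac{\Gamma \vdash (cond_1) : wt \quad \ldots \quad \Gamma \vdash (cond_n) : wt}{\Gamma \vdash (cond_1 \wedge \ldots \wedge cond_n) : wt}$

[T-Rule]  $\dfrac{\Gamma \vdash (cond) : wt \quad \Gamma \vdash e_1 : s_1^{g_1} \quad \ldots \quad \Gamma \vdash e_n : s_n^{g_n}}{\Gamma \vdash (cond \to (e_1,\ldots,e_n)) : wt}$
          if $sortOf(\Gamma,e_i) = s_i^{g_i}$, where $g_i \in \mathcal{F} \cup \mathcal{F}_v \cup \{?\}$ for $i \in [1,n]$

Figure 2: Type checking rules.

3 Type inference

The type system presented in Section 2 needs rules to control its use in order to find the expected
deduction tree of an expression. Without these rules it is possible to find more than one deduction tree
for the same expression. For instance, in Example 2.2 the rule [Sub] can be applied to the leaves resulting
from the application of rule [T-Var]. The resulting tree is still a valid deduction tree, since the variables
in the leaves will have type $N^{?}$ instead of the type $Z^{?}$ declared in the context, and
$N^{?} <:_s Z^{?}$. For that reason, we are interested in defining another type system able to infer the most
general types of terms. We add type variables to the set of types (defined up to here as a set of decorated
sorts) to describe a possibly infinite set of decorated sorts. The set of types
$\mathcal{T}ype(\mathcal{D} \cup \{wt\}, \mathcal{V})$ is given by a set of decorated sorts $\mathcal{D}$, a set
of type variables $\mathcal{V}$ and a special sort $wt$: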

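$\tau ::= \alpha \mid s^g \mid wt$

where $\tau \in \mathcal{T}ype(\mathcal{D} \cup \{wt\}, \mathcal{V})$, $\alpha \in \mathcal{V}$, $g \in \mathcal{F} \cup \mathcal{F}_v \cup \{?\}$ and $s^g \in \mathcal{D}$.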

In order to build the subtyping rule into the rules, we use a constraint set $C$ to store all equality and
subtyping constraints. These constraints limit the types that terms can have. The constraint language
$\mathcal{C}$ is built from the set of types $\mathcal{T}ype(\mathcal{D} \cup \{wt\},\mathcal{V})$ and the
operators $=_s$ (equality) and $<:_s$ (extension to $\mathcal{T}ype(\mathcal{D} \cup \{wt\},\mathcal{V})$ of
the partial order defined in Subsection 2.1):

$c ::= \tau_1 =_s \tau_2 \mid \tau_1 <:_s \tau_2$

where $c \in \mathcal{C}$ and $\tau_1,\tau_2 \in \mathcal{T}ype(\mathcal{D} \cup \{wt\},\mathcal{V})$.

A substitution $\sigma$ is said to satisfy an equation $\tau_1 =_s \tau_2$ if $\sigma\tau_1 = \sigma\tau_2$.
Moreover, $\sigma$ is said to satisfy a subtype relation $\tau_1 <:_s \tau_2$ if
$\sigma\tau_1 <:_s \sigma\tau_2$.

Thus, $\sigma$ is a solution for $C$ if it satisfies all constraints in $C$. This is written
$\sigma \models C$. The set $\mathcal{V}(C)$ denotes the set of type variables in $C$.
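Checking that a substitution satisfies a constraint set is a direct loop over the constraints. The following sketch makes several simplifying assumptions that are not part of the paper: types are encoded as strings (`"a1"`, `"a2"`, ... for type variables, `"S^g"` for a decorated sort), the subtype declarations are stored as a single-inheritance parent map, and the class `Satisfy` is a hypothetical name.

```java
import java.util.*;

// Sketch: does a substitution satisfy a constraint set? Names and encoding are illustrative.
final class Satisfy {
    // left =_s right when isEquality is true, left <:_s right otherwise.
    static final class Constraint {
        final String left, right;
        final boolean isEquality;
        Constraint(String left, boolean isEquality, String right) {
            this.left = left; this.isEquality = isEquality; this.right = right;
        }
    }

    final Map<String, String> parent = new HashMap<>(); // sort -> direct supersort

    // Apply sigma to a type by following bindings; assumes sigma is acyclic.
    static String apply(Map<String, String> sigma, String t) {
        while (sigma.containsKey(t)) t = sigma.get(t);
        return t;
    }

    // Decorated subtyping on ground types written "S^g".
    boolean subtype(String t1, String t2) {
        if (t1.equals(t2)) return true;
        if (!t1.contains("^") || !t2.contains("^")) return false;
        String[] a = t1.split("\\^"), b = t2.split("\\^");
        boolean decoOk = a[1].equals(b[1]) || b[1].equals("?");
        for (String s = a[0]; s != null; s = parent.get(s))
            if (s.equals(b[0])) return decoOk;
        return false;
    }

    // sigma |= C: every constraint holds after applying sigma to both sides.
    boolean satisfies(Map<String, String> sigma, List<Constraint> cs) {
        for (Constraint c : cs) {
            String l = apply(sigma, c.left), r = apply(sigma, c.right);
            if (c.isEquality ? !l.equals(r) : !subtype(l, r)) return false;
        }
        return true;
    }
}
```

In this encoding a type variable left unbound by the substitution only satisfies constraints that relate it to itself, which is stricter than necessary but keeps the sketch short.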
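(1)  $\Gamma \vdash \ell() : Z^\ell$   [T-Empty]
(2)  $\Gamma \vdash x^* : Z^\ell$   [T-SVar]
(3)  $\Gamma \vdash \ell(x^*) : Z^\ell$   [T-Merge, from (1) and (2)]
(4)  $\Gamma \vdash y : Z^{?}$   [T-Var]
(5)  $\Gamma \vdash \ell(x^*,y) : Z^\ell$   [T-Elem, from (3) and (4)]
(6)  $\Gamma \vdash z^* : Z^\ell$   [T-SVar]
(7)  $\Gamma \vdash \ell(x^*,y,z^*) : Z^\ell$   [T-Merge, from (5) and (6)]
(8)  $\Gamma \vdash \ell(x^*,y,z^*) : Z^{?}$   [Gen, from (7)]
(9)  $\Gamma \vdash \ell() : Z^\ell$   [T-Empty]
(10) $\Gamma \vdash one() : N^{one}$   [T-Fun]
(11) $\Gamma \vdash one() : N^{?}$   [Gen, from (10)]
(12) $\Gamma \vdash one() : Z^{?}$   [Sub, from (11)]
(13) $\Gamma \vdash \ell(one()) : Z^\ell$   [T-Elem, from (9) and (12)]
(14) $\Gamma \vdash \ell(one()) : Z^{?}$   [Gen, from (13)]
(15) $\Gamma \vdash (\ell(x^*,y,z^*) \prec\!\!\prec_{[Z^{?}]} \ell(one())) : wt$   [T-Match, from (8) and (14)]
(16) $\Gamma \vdash y : Z^{?}$   [T-Var]
(17) $\Gamma \vdash (\ell(x^*,y,z^*) \prec\!\!\prec_{[Z^{?}]} \ell(one()) \to (y)) : wt$   [T-Rule, from (15) and (16)]

Figure 3: Type checking example.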

Constraints are calculated according to the application of the rules of the type inference system (see
Fig. 4), where we can read the judgment $\Gamma \vdash_{ct} e : \tau \bullet C$ as "the term $e$ has type
$\tau$ under assumptions $\Gamma$ whenever the constraints $C$ are satisfied". More formally, this judgment
states that $\forall \sigma.\ (\sigma \models C \Rightarrow \sigma\Gamma \vdash e : \sigma\tau)$.

3.1 Type inference algorithm 

In Fig. 4 we give a type inference system with constraints. In order to infer the type of a given
expression $\pi$, the context $\Gamma$ is initialized to: 1) subtype declarations of the form
$s_1^{?} <:_s s_2^{?}$ where $s_1^{?}, s_2^{?} \in \mathcal{D}$; 2) a pair of the form
$(f : s_1^{?},\ldots,s_n^{?} \to s^f)$ for each syntactic operator $f$ occurring in $\pi$, where
$s_i^{?}, s^f \in \mathcal{D}$ for $i \in [1,n]$; 3) a pair of the form $(\ell : (s_1^{?})^* \to s^\ell)$ for
each variadic operator $\ell$ occurring in $\pi$, where $s_1^{?}, s^\ell \in \mathcal{D}$; 4) a pair of the
form $(x : \alpha)$ for each variable $x$ occurring in $\pi$, where $\alpha \in \mathcal{V}$ is a fresh type
variable; 5) a pair of the form $(x^* : \alpha)$ for each star variable $x^*$ occurring in $\pi$, where
$\alpha \in \mathcal{V}$ is a fresh type variable. Moreover, each type variable introduced in a
sub-derivation is a fresh type variable, and the fresh type variables in different sub-derivations are
distinct. As in Section 2.2, we explain the rules concerning lists: [CT-Empty] infers for an empty list
$\ell()$ a type variable $\alpha$ with the constraint $\alpha =_s s^\ell$, where $s^\ell$ is given by the
rank of $\ell$; [CT-Elem] treats applications of lists to elements which are neither lists with the same
function symbol nor star variables; [CT-Merge] is applied to concatenate two lists of the same type
$s^\ell$; and [CT-Star] is applied to concatenate a list and a star variable of the same type $s^\ell$.

Example 3.1. Let $\Gamma = \{\ell : (Z^{?})^* \to Z^\ell,\ one : \to N^{one},\ x^* : \alpha_1,\ y : \alpha_2,\ z^* : \alpha_3,\ N^{?} <:_s Z^{?}\}$.
Then the expression $\ell(x^*,y,z^*) \prec\!\!\prec_{[\alpha_4]} \ell(one()) \to (y)$ is well-typed and the
deduction tree is given in Fig. 5.

3.2 Constraint resolution 

In Fig. 6 we propose an algorithm to decide whether a given constraint set $C$ has a solution, where
$g_1,g_2 \in \mathcal{F} \cup \mathcal{F}_v \cup \{?\}$. We denote by $s^{g_1} <:_s s'^{g_2} \in \Gamma$ the
fact that there exist $s_1,\ldots,s_n$ such that $s^{?} <:_s s_1^{?} \in \Gamma$,
$s_1^{?} <:_s s_2^{?} \in \Gamma$, ..., $s_n^{?} <:_s s'^{?} \in \Gamma$ and ($g_1 = g_2$ or $g_2 = {?}$). If
the algorithm stops without failure then $C$ is said to be in solved form.

While solving a constraint set C we wish to make sure, after each application of a constraint resolution 
rule, that the constraint set at hand is satisfiable, so as to detect errors as soon as possible. Therefore we 
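[CT-Var]   $\dfrac{}{\Gamma(x : \tau) \vdash_{ct} x : \alpha \bullet \{\alpha =_s \tau\}}$

[CT-SVar]  $\dfrac{}{\Gamma(x^* : \alpha_1) \vdash_{ct} x^* : \alpha \bullet \{\alpha_1 =_s \alpha\}}$

[CT-Fun]   $\dfrac{\Gamma \vdash_{ct} e_1 : \alpha_1 \bullet C_1 \quad \ldots \quad \Gamma \vdash_{ct} e_n : \alpha_n \bullet C_n}{\Gamma(f : s_1^{?},\ldots,s_n^{?} \to s^f) \vdash_{ct} f(e_1,\ldots,e_n) : \alpha \bullet \{\alpha =_s s^f\} \cup \bigcup_{i=1}^{n} \left(C_i \cup \{\alpha_i <:_s s_i^{?}\}\right)}$

[CT-Empty] $\dfrac{}{\Gamma(\ell : (s_1^{?})^* \to s^\ell) \vdash_{ct} \ell() : \alpha \bullet \{\alpha =_s s^\ell\}}$

[CT-Elem]  $\dfrac{\Gamma \vdash_{ct} \ell(e_1,\ldots,e_n) : \alpha \bullet C_1 \quad \Gamma \vdash_{ct} e : \alpha_1 \bullet C_2}{\Gamma(\ell : (s_1^{?})^* \to s^\ell) \vdash_{ct} \ell(e_1,\ldots,e_n,e) : \alpha \bullet \{\alpha =_s s^\ell, \alpha_1 <:_s s_1^{?}\} \cup C_1 \cup C_2}$
           if $sortOf(\Gamma,e) \neq s^\ell$ and $e \neq x^*$

[CT-Merge] $\dfrac{\Gamma \vdash_{ct} \ell(e_1,\ldots,e_n) : \alpha \bullet C_1 \quad \Gamma \vdash_{ct} e : \alpha \bullet C_2}{\Gamma(\ell : (s_1^{?})^* \to s^\ell) \vdash_{ct} \ell(e_1,\ldots,e_n,e) : \alpha \bullet \{\alpha =_s s^\ell\} \cup C_1 \cup C_2}$
           if $sortOf(\Gamma,e) = s^\ell$

[CT-Star]  $\dfrac{\Gamma \vdash_{ct} \ell(e_1,\ldots,e_n) : \alpha \bullet C_1 \quad \Gamma \vdash_{ct} x^* : \alpha \bullet C_2}{\Gamma(\ell : (s_1^{?})^* \to s^\ell) \vdash_{ct} \ell(e_1,\ldots,e_n,x^*) : \alpha \bullet \{\alpha =_s s^\ell\} \cup C_1 \cup C_2}$

[CT-Match] $\dfrac{\Gamma \vdash_{ct} e_1 : \alpha_1 \bullet C_1 \quad \Gamma \vdash_{ct} e_2 : \alpha_2 \bullet C_2}{\Gamma \vdash_{ct} (e_1 \prec\!\!\prec_{[\tau]} e_2) : wt \bullet \{\alpha_1 <:_s \tau, \alpha_2 =_s \tau\} \cup C_1 \cup C_2}$

[CT-Conj]  $\dfrac{\Gamma \vdash_{ct} (cond_1) : wt \bullet C_1 \quad \ldots \quad \Gamma \vdash_{ct} (cond_n) : wt \bullet C_n}{\Gamma \vdash_{ct} (cond_1 \wedge \ldots \wedge cond_n) : wt \bullet \bigcup_{i=1}^{n} C_i}$

[CT-Rule]  $\dfrac{\Gamma \vdash_{ct} (cond) : wt \bullet C_{cond} \quad \Gamma \vdash_{ct} e_1 : \tau_1 \bullet C_1 \quad \ldots \quad \Gamma \vdash_{ct} e_n : \tau_n \bullet C_n}{\Gamma \vdash_{ct} (cond \to (e_1,\ldots,e_n)) : wt \bullet C_{cond} \cup \bigcup_{i=1}^{n} C_i}$
           if $sortOf(\Gamma,e_i) = \tau_i$, for $i \in [1,n]$, where $\tau_i \in \mathcal{T}ype(\mathcal{D} \cup \{wt\},\mathcal{V})$

Figure 4: Type inference rules.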



must combine the rules for error detection and constraint resolution in order to keep $C$ in solved form. The
rules for the constraint resolution algorithm are provided in Fig. 7, where
$g,g_1,g_2 \in \mathcal{F} \cup \mathcal{F}_v \cup \{?\}$. The rules (1)-(14) are recursively applied over
$C$. More precisely, rules (1)-(3) work as a garbage collector, removing constraints that are no longer
useful. Rules (4) and (5) generate $\sigma$. Rules (6) and (7) generate simpler constraints. Rules (8)-(12)
generate $\sigma$ and simplified constraints by antisymmetric and transitive subtype closure. Rules (13) and
(14) are applied when none of the previous rules can be applied, generating a new $\sigma$ from a constraint
over a type variable that has no other constraints. The algorithm
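(a) $\Gamma \vdash_{ct} \ell() : \alpha_5 \bullet C_3 = \{\alpha_5 =_s Z^\ell\}$   [CT-Empty]
(b) $\Gamma \vdash_{ct} x^* : \alpha_5 \bullet C_4 = \{\alpha_5 =_s \alpha_1\}$   [CT-SVar]
(c) $\Gamma \vdash_{ct} \ell(x^*) : \alpha_5 \bullet C_2 = \{\alpha_5 =_s Z^\ell\} \cup C_3 \cup C_4$   [CT-Star, from (a) and (b)]
(d) $\Gamma \vdash_{ct} y : \alpha_8 \bullet C_5 = \{\alpha_8 =_s \alpha_2\}$   [CT-Var]
(e) $\Gamma \vdash_{ct} \ell(x^*,y) : \alpha_5 \bullet C_1 = \{\alpha_5 =_s Z^\ell, \alpha_8 <:_s Z^{?}\} \cup C_2 \cup C_5$   [CT-Elem, from (c) and (d)]
(f) $\Gamma \vdash_{ct} z^* : \alpha_5 \bullet C_6 = \{\alpha_5 =_s \alpha_3\}$   [CT-SVar]
(g) $\Gamma \vdash_{ct} \ell(x^*,y,z^*) : \alpha_5 \bullet C_p = \{\alpha_5 =_s Z^\ell\} \cup C_1 \cup C_6$   [CT-Star, from (e) and (f)]
(h) $\Gamma \vdash_{ct} \ell() : \alpha_6 \bullet C_7 = \{\alpha_6 =_s Z^\ell\}$   [CT-Empty]
(i) $\Gamma \vdash_{ct} one() : \alpha_7 \bullet C_8 = \{\alpha_7 =_s N^{one}\}$   [CT-Fun]
(j) $\Gamma \vdash_{ct} \ell(one()) : \alpha_6 \bullet C_s = \{\alpha_6 =_s Z^\ell, \alpha_7 <:_s Z^{?}\} \cup C_7 \cup C_8$   [CT-Elem, from (h) and (i)]
(k) $\Gamma \vdash_{ct} (\ell(x^*,y,z^*) \prec\!\!\prec_{[\alpha_4]} \ell(one())) : wt \bullet C_{cond} = \{\alpha_5 <:_s \alpha_4, \alpha_6 =_s \alpha_4\} \cup C_p \cup C_s$   [CT-Match, from (g) and (j)]
(l) $\Gamma \vdash_{ct} y : \alpha_9 \bullet C_{10} = \{\alpha_9 =_s \alpha_2\}$   [CT-Var]
(m) $\Gamma \vdash_{ct} (\ell(x^*,y,z^*) \prec\!\!\prec_{[\alpha_4]} \ell(one()) \to (y)) : wt \bullet C_r = \{\alpha_2 =_s \alpha_2\} \cup C_{cond} \cup C_{10}$   [CT-Rule, from (k) and (l)]

Figure 5: Type inference example.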



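(1) $\{s_1^{g_1} <:_s \alpha,\ \alpha <:_s s_2^{g_2}\} \uplus C' \Rightarrow fail$   if $s_1^{g_1} <:_s s_2^{g_2} \notin \Gamma$

(2) $\{s_1^{g_1} <:_s \alpha,\ s_2^{g_2} <:_s \alpha\} \uplus C' \Rightarrow fail$   if $\nexists s.\ (s_1^{g_1} <:_s s^{?} \in \Gamma \wedge s_2^{g_2} <:_s s^{?} \in \Gamma)$

(3) $\{\alpha <:_s s_1^{g_1},\ \alpha <:_s s_2^{g_2}\} \uplus C' \Rightarrow fail$   if $(s_1^{g_1} <:_s s_2^{g_2} \notin \Gamma \wedge s_2^{g_2} <:_s s_1^{g_1} \notin \Gamma)$

(4) $\{s_1^{g_1} <:_s s_2^{g_2}\} \uplus C' \Rightarrow fail$   if $s_1^{g_1} <:_s s_2^{g_2} \notin \Gamma$

(5) $\{s_1^{g_1} =_s s_2^{g_2}\} \uplus C' \Rightarrow fail$   if $s_1 \neq s_2 \vee g_1 \neq g_2$

Figure 6: Rules for detection of errors in a constraint set C.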



stops if: a rule returns $C = \emptyset$, in which case the algorithm returns the solution $\sigma$; if $C$
reaches a non-solved form, in which case the algorithm for detection of errors returns fail; or if $C$
reaches a normal form different from the empty set, in which case the algorithm returns an error. We say
that the algorithm is failing if it returns either fail or an error.

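Example 3.2. Let $\Gamma = \{\ell : (Z^{?})^* \to Z^\ell,\ one : \to N^{one},\ x^* : \alpha_1,\ y : \alpha_2,\ z^* : \alpha_3,\ N^{?} <:_s Z^{?}\}$
and $C_{cond} = \{\alpha_5 =_s Z^\ell,\ \alpha_{10} =_s \alpha_1,\ \alpha_5 =_s Z^\ell,\ \alpha_{10} =_s Z^\ell,\ \alpha_9 =_s \alpha_2,\ \alpha_5 =_s Z^\ell,\ \alpha_9 <:_s Z^{?},\ \alpha_8 =_s \alpha_3,\ \alpha_5 =_s Z^\ell,\ \alpha_8 =_s Z^\ell,\ \alpha_6 =_s Z^\ell,\ \alpha_7 =_s N^{one},\ \alpha_6 =_s Z^\ell,\ \alpha_7 <:_s Z^{?},\ \alpha_5 <:_s \alpha_4,\ \alpha_6 =_s \alpha_4,\ \alpha_2 =_s \alpha_2\}$
from Example 3.1. Let $\sigma = \emptyset$ and $C = C_{cond}$. The constraint resolution algorithm starts by:

1. Application of a sequence of rules (4), (1) and (5), generating
$\{\alpha_2 <:_s Z^{?},\ N^{one} <:_s Z^{?},\ Z^\ell <:_s Z^\ell\} \cup C$ and
$\{\alpha_5 \mapsto Z^\ell,\ \alpha_{10} \mapsto \alpha_1,\ \alpha_1 \mapsto Z^\ell,\ \alpha_9 \mapsto \alpha_2,\ \alpha_8 \mapsto \alpha_3,\ \alpha_3 \mapsto Z^\ell,\ \alpha_6 \mapsto Z^\ell,\ \alpha_7 \mapsto N^{one},\ \alpha_4 \mapsto Z^\ell\} \cup \sigma$;

2. Application of rules (1), (2) and (3), generating $\{\alpha_2 <:_s Z^{?}\}$ and $\sigma$;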
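(1)  $\{\tau =_s \tau\} \uplus C', \sigma \Rightarrow C', \sigma$

(2)  $\{\tau <:_s \tau\} \uplus C', \sigma \Rightarrow C', \sigma$

(3)  $\{s_1^{g_1} <:_s s_2^{g_2}\} \uplus C', \sigma \Rightarrow C', \sigma$   if $s_1^{g_1} <:_s s_2^{g_2} \in \Gamma$

(4)  $\{\alpha =_s \tau\} \uplus C', \sigma \Rightarrow [\alpha \mapsto \tau]C', \{\alpha \mapsto \tau\} \cup \sigma$

(5)  $\{\tau =_s \alpha\} \uplus C', \sigma \Rightarrow [\alpha \mapsto \tau]C', \{\alpha \mapsto \tau\} \cup \sigma$

(6)  $\{s_1^{g_1} <:_s \alpha,\ s_2^{g_2} <:_s \alpha\} \uplus C', \sigma \Rightarrow \{s^{?} <:_s \alpha\} \cup C', \sigma$   if $\exists s.\ (s_1^{g_1} <:_s s^{?} \in \Gamma \wedge s_2^{g_2} <:_s s^{?} \in \Gamma)$

(7a) $\{\alpha <:_s s_1^{g_1},\ \alpha <:_s s_2^{g_2}\} \uplus C', \sigma \Rightarrow \{\alpha <:_s s_1^{g_1}\} \cup C', \sigma$   if $(s_1^{g_1} <:_s s_2^{g_2} \in \Gamma)$

(7b) $\{\alpha <:_s s_1^{g_1},\ \alpha <:_s s_2^{g_2}\} \uplus C', \sigma \Rightarrow \{\alpha <:_s s_2^{g_2}\} \cup C', \sigma$   if $(s_2^{g_2} <:_s s_1^{g_1} \in \Gamma)$

(8)  $\{\tau_1 <:_s \tau_2,\ \tau_2 <:_s \tau_1\} \uplus C', \sigma \Rightarrow \{\tau_1 =_s \tau_2\} \cup C', \sigma$

(9)  $\{\alpha_1 <:_s \alpha,\ \alpha <:_s \alpha_2\} \uplus C', \sigma \Rightarrow \{\alpha_1 <:_s \alpha_2\} \cup [\alpha \mapsto \alpha_2]C', \{\alpha \mapsto \alpha_2\} \cup \sigma$

(10) $\{s^{g} <:_s \alpha,\ \alpha <:_s \alpha_1\} \uplus C', \sigma \Rightarrow \{s^{g} <:_s \alpha_1\} \cup [\alpha \mapsto \alpha_1]C', \{\alpha \mapsto \alpha_1\} \cup \sigma$

(11) $\{\alpha_1 <:_s \alpha,\ \alpha <:_s s^{g}\} \uplus C', \sigma \Rightarrow \{\alpha_1 <:_s s^{g}\} \cup [\alpha \mapsto \alpha_1]C', \{\alpha \mapsto \alpha_1\} \cup \sigma$

(12) $\{s_1^{g_1} <:_s \alpha,\ \alpha <:_s s_2^{g_2}\} \uplus C', \sigma \Rightarrow [\alpha \mapsto s_2^{g_2}]C', \{\alpha \mapsto s_2^{g_2}\} \cup \sigma$   if $s_1^{g_1} <:_s s_2^{g_2} \in \Gamma$

(13) $\{\alpha <:_s \tau\} \uplus C', \sigma \Rightarrow C', \{\alpha \mapsto \tau\} \cup \sigma$   if $\alpha \notin \mathcal{V}(C')$

(14) $\{\tau <:_s \alpha\} \uplus C', \sigma \Rightarrow C', \{\alpha \mapsto \tau\} \cup \sigma$   if $\alpha \notin \mathcal{V}(C')$

Figure 7: Constraint resolution rules in context $\Gamma$.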



3. Application of rule (13), generating $\emptyset$ and $\{\alpha_2 \mapsto Z^{?}\} \cup \sigma$; the
algorithm then stops and returns $\sigma$, providing a substitution for all type variables in the deduction
tree of $\ell(x^*,y,z^*) \prec\!\!\prec_{[\alpha_4]} \ell(one()) \to (y)$.
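The equality part of such a resolution can be sketched in a few lines. The code below is a simplified illustration and not the algorithm of Fig. 7 itself: it only handles equations (in the spirit of rules (1), (4) and (5), plus the failure case for two distinct decorated sorts), it resolves bindings lazily instead of rewriting the remaining constraints, and it assumes that type variables are exactly the strings starting with `a`. The class `EqSolver` is a hypothetical name.

```java
import java.util.*;

// Sketch of the equality part of constraint resolution; names and encoding are illustrative.
final class EqSolver {
    // Assumption of this sketch: type variables are written "a1", "a2", ...
    static boolean isVar(String t) { return t.startsWith("a"); }

    // Follow the bindings of sigma until an unbound variable or a sort is reached.
    static String resolve(Map<String, String> sigma, String t) {
        while (isVar(t) && sigma.containsKey(t)) t = sigma.get(t);
        return t;
    }

    // Each equation is a pair {left, right}. Returns the substitution,
    // or null when two different decorated sorts are equated (failure).
    static Map<String, String> solve(List<String[]> equations) {
        Map<String, String> sigma = new HashMap<>();
        for (String[] eq : equations) {
            String l = resolve(sigma, eq[0]), r = resolve(sigma, eq[1]);
            if (l.equals(r)) continue;            // trivial equation: drop it
            if (isVar(l)) sigma.put(l, r);        // variable on the left: bind it
            else if (isVar(r)) sigma.put(r, l);   // variable on the right: bind it
            else return null;                     // two distinct sorts: fail
        }
        return sigma;
    }
}
```

Run on some of the equations of Example 3.2, this binds the type variable of `x*` and the type variable of the matching condition to the list type, while the variable of `y` stays unresolved until the subtyping constraints are processed.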

4 Properties 

Since our type checking system and our type inference system address the same issue, we must check
two properties. First, we show that every typing judgment that can be derived from the inference rules
also follows from the checking rules (soundness, Theorem 4.2). Then we show that
a solution given by the checking rules can be extended to a solution proposed by the inference rules
(completeness, Theorem 4.4).

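Definition 4.1 (Solution). Let $\Gamma$ be a context and $e$ a term.

• A solution for $(\Gamma,e)$ is a pair $(\sigma,\tau_1)$ such that $\sigma\Gamma \vdash \sigma e : \tau_1$, where $\tau_1 \in \mathcal{D} \cup \{wt\}$.

• Assuming a well-formed sequent $\Gamma \vdash_{ct} e : \tau \bullet C$, a solution for $(\Gamma,e,\tau,C)$ is a
pair $(\sigma,\tau_2)$ such that $\sigma$ satisfies $C$ and $\sigma\tau <:_s \tau_2$, where
$\tau_2 \in \mathcal{D} \cup \{wt\}$ and $\tau \in \mathcal{T}ype(\mathcal{D} \cup \{wt\},\mathcal{V})$.

Theorem 4.2 (Soundness of constraint typing). Suppose that $\Gamma \vdash_{ct} e : \tau \bullet C$ is a valid
sequent. If $(\sigma,s^g)$ is a solution for $(\Gamma,e,\tau,C)$, then it is also a solution for $(\Gamma,e)$
(i.e. $e$ is well-typed in $\Gamma$).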

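Proof. By induction on the given constraint typing derivation for $\Gamma \vdash_{ct} e : \tau \bullet C$. We
just detail the most noteworthy cases of this proof.

Case CT-Elem: $e = \ell(a_1,\ldots,a_n,a)$, $\tau = \alpha$

$\Gamma \vdash_{ct} \ell(a_1,\ldots,a_n) : \alpha \bullet C_1 \qquad \Gamma \vdash_{ct} a : \alpha_1 \bullet C_2$

$C = C_1 \cup C_2 \cup \{\alpha =_s s_2^\ell,\ \alpha_1 <:_s s_1^{?}\}$

We are given that $(\sigma,s^g)$ is a solution for $(\Gamma(\ell : (s_1^{?})^* \to s_2^\ell),e,\alpha,C)$, that
is, $\sigma$ satisfies $C$ and $\sigma\alpha <:_s s^g$. Since $(\sigma,s^g)$ satisfies $C_1$ and $C_2$,
$(\sigma,\sigma\alpha)$ and $(\sigma,\sigma\alpha_1)$ are solutions for
$(\Gamma,\ell(a_1,\ldots,a_n),\alpha,C_1)$ and $(\Gamma,a,\alpha_1,C_2)$, respectively. By the induction
hypothesis, we have $\sigma\Gamma \vdash \sigma(\ell(a_1,\ldots,a_n)) : \sigma\alpha$ and
$\sigma\Gamma \vdash \sigma a : \sigma\alpha_1$. Since $\sigma\alpha_1 <:_s s_1^{?}$, by Sub we obtain
$\sigma\Gamma \vdash \sigma a : s_1^{?}$. Since $\sigma\alpha = s_2^\ell$, by T-Elem we obtain
$\sigma(\Gamma(\ell : (s_1^{?})^* \to s_2^\ell)) \vdash \sigma(\ell(a_1,\ldots,a_n,a)) : s_2^\ell$. By Sub we
obtain $\sigma(\Gamma(\ell : (s_1^{?})^* \to s_2^\ell)) \vdash \sigma(\ell(a_1,\ldots,a_n,a)) : s^g$,
as required.

Case CT-Merge: $e = \ell(a_1,\ldots,a_n,a)$, $\tau = \alpha$

$\Gamma \vdash_{ct} \ell(a_1,\ldots,a_n) : \alpha \bullet C_1 \qquad \Gamma \vdash_{ct} a : \alpha \bullet C_2$

$C = C_1 \cup C_2 \cup \{\alpha =_s s_2^\ell\}$

We are given that $(\sigma,s^g)$ is a solution for $(\Gamma(\ell : (s_1^{?})^* \to s_2^\ell),e,\alpha,C)$, that
is, $\sigma$ satisfies $C$ and $\sigma\alpha <:_s s^g$. Since $(\sigma,s^g)$ satisfies $C_1$ and $C_2$,
$(\sigma,\sigma\alpha)$ and $(\sigma,\sigma\alpha)$ are solutions for
$(\Gamma,\ell(a_1,\ldots,a_n),\alpha,C_1)$ and $(\Gamma,a,\alpha,C_2)$. By the induction hypothesis, we have
$\sigma\Gamma \vdash \sigma(\ell(a_1,\ldots,a_n)) : \sigma\alpha$ and $\sigma\Gamma \vdash \sigma a : \sigma\alpha$.
Since $\sigma\alpha = s_2^\ell$, by T-Merge we obtain
$\sigma(\Gamma(\ell : (s_1^{?})^* \to s_2^\ell)) \vdash \sigma(\ell(a_1,\ldots,a_n,a)) : s_2^\ell$. By Sub we
obtain $\sigma(\Gamma(\ell : (s_1^{?})^* \to s_2^\ell)) \vdash \sigma(\ell(a_1,\ldots,a_n,a)) : s^g$, as required.

Case CT-Match: $e = a_1 \prec\!\!\prec_{[\tau_1]} a_2$, $\tau = wt$

$\Gamma \vdash_{ct} a_1 : \alpha_1 \bullet C_1 \qquad \Gamma \vdash_{ct} a_2 : \alpha_2 \bullet C_2$

$C = C_1 \cup C_2 \cup \{\alpha_1 <:_s \tau_1,\ \alpha_2 =_s \tau_1\}$

We are given that $(\sigma,wt)$ is a solution for $(\Gamma,e,wt,C)$, that is, $\sigma$ satisfies $C$ and
$\sigma wt <:_s wt$. Since $(\sigma,wt)$ satisfies $C_1$ and $C_2$, $(\sigma,\sigma\alpha_1)$ and
$(\sigma,\sigma\alpha_2)$ are solutions for $(\Gamma,a_1,\alpha_1,C_1)$ and $(\Gamma,a_2,\alpha_2,C_2)$,
respectively. By the induction hypothesis, we have $\sigma\Gamma \vdash \sigma a_1 : \sigma\alpha_1$ and
$\sigma\Gamma \vdash \sigma a_2 : \sigma\alpha_2$. Since $\sigma\alpha_1 <:_s \sigma\tau_1$, by Sub we obtain
$\sigma\Gamma \vdash \sigma a_1 : \sigma\tau_1$. Since $\sigma\alpha_2 = \sigma\tau_1$, by T-Match we obtain
$\sigma\Gamma \vdash \sigma(a_1 \prec\!\!\prec_{[\tau_1]} a_2) : wt$, as required. □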

Definition 4.3 (Normal form of typing derivation). A typing derivation is in normal form if it does not
have successive applications of rule [Sub].

Theorem 4.4 (Completeness of constraint typing). Suppose that $\pi = \Gamma \vdash_{ct} e : \tau \bullet C$.
Write $V(\pi)$ for the set of all type variables mentioned in the last rule used to derive $\pi$ and write
$\sigma \backslash V(\pi)$ for the substitution that is undefined for all the variables in $V(\pi)$ and
otherwise behaves like $\sigma$. If $(\sigma,s^g)$ is a solution for $(\Gamma,e)$ and
$dom(\sigma) \cap V(\pi) = \emptyset$, then there is some solution $(\sigma',s^g)$ for $(\Gamma,e,\tau,C)$
such that $\sigma' \backslash V(\pi) = \sigma$.

Proof. By induction on the given constraint typing derivation in normal form, taking care with the
fresh names of variables. We just detail the most noteworthy cases of this proof.

Case CT-Elem: $e = \ell(a_1,\ldots,a_n,a)$, $\tau = \alpha$

$\pi_1 = \Gamma \vdash_{ct} \ell(a_1,\ldots,a_n) : \alpha \bullet C_1 \qquad \pi_2 = \Gamma \vdash_{ct} a : \alpha_1 \bullet C_2$

$C = C_1 \cup C_2 \cup \{\alpha =_s s_2^\ell,\ \alpha_1 <:_s s_1^{?}\} \qquad V(\pi) = \{\alpha,\alpha_1\}$

$sortOf(\Gamma,a) \neq s_2^\ell$

From the assumption that $(\sigma,s^g)$ is a solution for
$(\Gamma(\ell : (s_1^{?})^* \to s_2^\ell),\ell(a_1,\ldots,a_n,a))$ and $dom(\sigma) \cap V(\pi) = \emptyset$, we
have $\sigma(\Gamma(\ell : (s_1^{?})^* \to s_2^\ell)) \vdash \sigma(\ell(a_1,\ldots,a_n,a)) : s^g$. This can be
derived from: 1) T-Merge, 2) T-Elem or 3) Sub. In all those cases, we must exhibit a substitution $\sigma'$
such that: (a) $\sigma' \backslash V(\pi)$ agrees with $\sigma$; (b) $\sigma'\alpha <:_s s^g$; (c) $\sigma'$
satisfies $C_1$ and $C_2$; and (d) $\sigma'$ satisfies $\{\alpha =_s s_2^\ell,\ \alpha_1 <:_s s_1^{?}\}$. We
reason by cases as follows:

1. By T-Merge we assume that $s^g = s_2^\ell$ and we know that
$\sigma\Gamma \vdash \sigma(\ell(a_1,\ldots,a_n)) : s_2^\ell$ and $\sigma\Gamma \vdash \sigma a : s_2^\ell$.
But since we cannot find a type $s_3^\ell$ such that $s_3^\ell <:_s s_2^\ell$,
$\sigma\Gamma \vdash \sigma a : s_2^\ell$ cannot be derived even from Sub. Thus T-Merge is not a relevant case.

2. By T-Elem we assume that $s^g = s_2^\ell$ and we know that
$\sigma\Gamma \vdash \sigma(\ell(a_1,\ldots,a_n)) : s_2^\ell$ and $\sigma\Gamma \vdash \sigma a : s_1^{?}$.
By the induction hypothesis, there are solutions $(\sigma_1,s_2^\ell)$ for
$(\Gamma,\ell(a_1,\ldots,a_n),\alpha,C_1)$ and $(\sigma_2,s_1^{?})$ for $(\Gamma,a,\alpha_1,C_2)$, and
$dom(\sigma_1) \backslash V(\pi_1) = \emptyset = dom(\sigma_2) \backslash V(\pi_2)$. Define
$\sigma' = \{\alpha \mapsto s_2^\ell,\ \alpha_1 \mapsto s_1^{?}\} \cup \sigma \cup \sigma_1 \cup \sigma_2$.
Conditions (a), (b), (c) and (d) are obviously satisfied. Thus, we see that $(\sigma',s^g)$ is a
solution for $(\Gamma(\ell : (s_1^{?})^* \to s_2^\ell),\ell(a_1,\ldots,a_n,a),\alpha,C)$.

3. By Sub we assume that $s_2^\ell <:_s s^g \in \Gamma$ and we know that
$\sigma(\Gamma(\ell : (s_1^{?})^* \to s_2^\ell)) \vdash \sigma(\ell(a_1,\ldots,a_n,a)) : s_2^\ell$. This must be
derived from T-Elem, similar to case (2).

Case CT-Merge: $e = \ell(a_1,\ldots,a_n,a)$, $\tau = \alpha$

$\pi_1 = \Gamma \vdash_{ct} \ell(a_1,\ldots,a_n) : \alpha \bullet C_1 \qquad \pi_2 = \Gamma \vdash_{ct} a : \alpha \bullet C_2$

$C = C_1 \cup C_2 \cup \{\alpha =_s s_2^\ell\} \qquad V(\pi) = \{\alpha\}$

$sortOf(\Gamma,a) = s_2^\ell$

From the assumption that $(\sigma,s^g)$ is a solution for
$(\Gamma(\ell : (s_1^{?})^* \to s_2^\ell),\ell(a_1,\ldots,a_n,a))$ and $dom(\sigma) \cap V(\pi) = \emptyset$, we
have $\sigma(\Gamma(\ell : (s_1^{?})^* \to s_2^\ell)) \vdash \sigma(\ell(a_1,\ldots,a_n,a)) : s^g$. This can be
derived from: 1) T-Merge, 2) T-Elem or 3) Sub. In all those cases, we must exhibit a substitution $\sigma'$
such that: (a) $\sigma' \backslash V(\pi)$ agrees with $\sigma$; (b) $\sigma'\alpha <:_s s^g$; (c) $\sigma'$
satisfies $C_1$ and $C_2$; and (d) $\sigma'$ satisfies $\{\alpha =_s s_2^\ell\}$. We reason by cases as
follows:

1. By T-Merge we assume that $s^g = s_2^\ell$ and we know that
$\sigma\Gamma \vdash \sigma(\ell(a_1,\ldots,a_n)) : s_2^\ell$ and $\sigma\Gamma \vdash \sigma a : s_2^\ell$.
By the induction hypothesis, there are solutions $(\sigma_1,s_2^\ell)$ for
$(\Gamma,\ell(a_1,\ldots,a_n),\alpha,C_1)$ and $(\sigma_2,s_2^\ell)$ for $(\Gamma,a,\alpha,C_2)$, and
$dom(\sigma_1) \backslash V(\pi_1) = \emptyset = dom(\sigma_2) \backslash V(\pi_2)$. Define
$\sigma' = \{\alpha \mapsto s_2^\ell\} \cup \sigma \cup \sigma_1 \cup \sigma_2$.
Conditions (a), (b), (c) and (d) are obviously satisfied. Thus, we see that $(\sigma',s^g)$ is a solution for
$(\Gamma(\ell : (s_1^{?})^* \to s_2^\ell),\ell(a_1,\ldots,a_n,a),\alpha,C)$.

2. By T-Elem we assume that $s^g = s_2^\ell$ and we know that
$\sigma\Gamma \vdash \sigma(\ell(a_1,\ldots,a_n)) : s_2^\ell$ and $\sigma\Gamma \vdash \sigma a : s_1^{?}$.
But, because of the application condition of T-Elem, we cannot find a type $s_1^{h}$ for $\sigma a$ such that
$s_1^{h} <:_s s_1^{?}$, so $\sigma\Gamma \vdash \sigma a : s_1^{?}$ cannot be derived from Gen. Likewise, since
we cannot find a type $s_3^\ell$ for $\sigma a$ such that $s_3^\ell <:_s s_1^{?}$,
$\sigma\Gamma \vdash \sigma a : s_1^{?}$ cannot be derived even from Sub. Thus T-Elem is not a
relevant case.

3. By Sub we assume that $s_2^\ell <:_s s^g \in \Gamma$ and we know that
$\sigma(\Gamma(\ell : (s_1^{?})^* \to s_2^\ell)) \vdash \sigma(\ell(a_1,\ldots,a_n,a)) : s_2^\ell$. This must be
derived from T-Merge, similar to case (1).

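Case CT-Match: $e = a_1 \prec\!\!\prec_{[\tau_1]} a_2$, $\tau = wt$

$\pi_1 = \Gamma \vdash_{ct} a_1 : \alpha_1 \bullet C_1 \qquad \pi_2 = \Gamma \vdash_{ct} a_2 : \alpha_2 \bullet C_2$

$C = C_1 \cup C_2 \cup \{\alpha_1 <:_s \tau_1,\ \alpha_2 =_s \tau_1\}$

$V(\pi) = \{\alpha_1,\alpha_2,\tau_1\}$ if $\tau_1 \in \mathcal{V}$, and $V(\pi) = \{\alpha_1,\alpha_2\}$ if $\tau_1 \notin \mathcal{V}$

From the assumption that $(\sigma,wt)$ is a solution for $(\Gamma,a_1 \prec\!\!\prec_{[\tau_1]} a_2)$ and
$dom(\sigma) \cap V(\pi) = \emptyset$, we have
$\sigma\Gamma \vdash \sigma(a_1 \prec\!\!\prec_{[\tau_1]} a_2) : wt$. This must be derived from T-Match, so we
know that $\sigma\Gamma \vdash \sigma a_1 : \sigma\tau_1$ and $\sigma\Gamma \vdash \sigma a_2 : \sigma\tau_1$.
By the induction hypothesis, there are solutions $(\sigma_1,\sigma\tau_1)$ for $(\Gamma,a_1,\alpha_1,C_1)$ and
$(\sigma_2,\sigma\tau_1)$ for $(\Gamma,a_2,\alpha_2,C_2)$. We must exhibit a substitution $\sigma'$ such that:
(a) $\sigma' \backslash V(\pi)$ agrees with $\sigma$; (b) $\sigma' wt <:_s wt$; (c) $\sigma'$ satisfies $C_1$
and $C_2$; and (d) $\sigma'$ satisfies $\{\alpha_1 <:_s \tau_1,\ \alpha_2 =_s \tau_1\}$. Define
$\sigma'' = \{\alpha_1 \mapsto s^g,\ \alpha_2 \mapsto s^g\} \cup \sigma \cup \sigma_1 \cup \sigma_2$, where
$s^g \in \mathcal{D}$. Moreover, define $\sigma' = \sigma'' \cup \{\tau_1 \mapsto s^g\}$ if
$\tau_1 \in \mathcal{V}$ and $\sigma' = \sigma''$ otherwise. Conditions (a), (b), (c) and (d) are obviously
satisfied. Thus, we see that $(\sigma',wt)$ is a solution for
$(\Gamma,(a_1 \prec\!\!\prec_{[\tau_1]} a_2),wt,C)$. □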

The constraint resolution algorithm always terminates. More formally:

Theorem 4.5 (Termination of algorithm).

1. the algorithm halts, either by failing or by returning a substitution, for all $C$;

2. if the algorithm returns $\sigma$, then $\sigma$ is a solution for $C$.

We can already sketch a proof of Theorem 4.5 following Pierce [15].

Proof. For part 1, define the degree of a constraint set $C$ to be the pair $(m,n)$, where $m$ is the number
of constraints in $C$ and $n$ is the number of subtyping constraints in $C$. The algorithm terminates
immediately (with success in the case of an empty constraint set, or failure for an equation involving two
different decorated sorts) or makes recursive calls to itself with a constraint set of lexicographically
smaller degree. For part 2, by induction on the number of recursive calls in the computation of the
algorithm. □
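The measure used in this proof sketch is easy to compute. The following illustration assumes that constraints are encoded as plain strings such as `"a1 = Z^l"` or `"a2 <: Z^?"`; the class `Degree` and this encoding are assumptions made for the example only.

```java
import java.util.*;

// Sketch of the termination measure; the string encoding of constraints is assumed.
final class Degree {
    // degree(C) = (m, n): m constraints in total, n of which are subtyping constraints.
    static int[] degree(List<String> constraints) {
        int n = 0;
        for (String c : constraints) if (c.contains("<:")) n++;
        return new int[] { constraints.size(), n };
    }

    // Lexicographic order on pairs: compare m first, then n.
    static boolean smaller(int[] d1, int[] d2) {
        return d1[0] < d2[0] || (d1[0] == d2[0] && d1[1] < d2[1]);
    }
}
```

For instance, replacing two mutual subtyping constraints by one equation, as rule (8) of Fig. 7 does, lowers the degree from (3,2) to (2,0) on a three-constraint set.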

5 Conclusion 

In this paper we have presented a type system for the pattern matching constructs of Tom. The system 
is composed of type checking and type inference algorithms with subtyping over sorts. Since Tom also 
implements associative pattern matching over variadic operators, we were interested in defining both a 
way to distinguish these from syntactic operators and checking and inferring their types. 

We have obtained the following: our type inference system is sound and complete w.r.t. checking,
as shown by Theorems 4.2 and 4.4. This is a first step towards an effective implementation, and thus
towards a safer Tom. However, we still need to investigate type unicity, which we believe to hold under
our assumptions of non-overloading and non-multiple inheritance.

As we have considered a subset of the Tom language, future work will focus on extending the type
system to handle the other constructions of the language, such as anti-patterns [12, 13]. As a slightly
more prospective research area, we also want parametric polymorphism over types for Tom: our type
system will therefore have to be able to handle that as well.